### Introduction
During my last internship, I used Apriori to improve the recommendation system of my company, KKday. KKday is the leading e-commerce travel platform in Asia. In this article I use sales data from KKday to illustrate how an Apriori-based recommender and a distance-based (cosine similarity) recommender perform, and how they differ.
### Data Preprocessing
require(dplyr)
## Loading required package: dplyr
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
require('DT')
## Loading required package: DT
head(data,10)
## X prod_oid user_id
## 1 1 8332 132591
## 2 2 3598 132591
## 3 3 8332 58804
## 4 4 12808 58804
## 5 5 18073 55631
## 6 6 9987 55631
## 7 7 17772 55631
## 8 8 12049 142657
## 9 9 17756 142657
## 10 10 7505 198893
The data contains the products ordered by each user. ‘prod_oid’ is the code of a product, and ‘user_id’ identifies the user who bought it. For example, user ‘132591’ bought products ‘8332’ and ‘3598’ together.
df <- data %>%
group_by(user_id) %>%
summarise(prod_oid_paste = paste(prod_oid, collapse=" "),
n = n()) %>% filter(n > 1) # remove users who ordered only one product
head(df)
## # A tibble: 6 x 3
## user_id prod_oid_paste n
## <int> <chr> <int>
## 1 1 2808 11971 2
## 2 2 10999 2173 2
## 3 3 2689 2686 2
## 4 4 17696 13367 2
## 5 5 18350 2716 2
## 6 6 17975 18576 2
retail.list <- df
# Separate the pasted product codes on the space character
retail.list <- sapply(retail.list$prod_oid_paste,strsplit, " ")
head(retail.list)
## $`2808 11971`
## [1] "2808" "11971"
##
## $`10999 2173`
## [1] "10999" "2173"
##
## $`2689 2686`
## [1] "2689" "2686"
##
## $`17696 13367`
## [1] "17696" "13367"
##
## $`18350 2716`
## [1] "18350" "2716"
##
## $`17975 18576`
## [1] "17975" "18576"
The data has to be transformed into the ‘transactions’ type to work with the arules package, which we explore next.
That is why we grouped the data by user_id and pasted each user's products together. The original data.frame has now become a list, and each element is a market basket ordered by one customer.
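As a side note, base R's split() can build the same kind of basket list directly, without pasting the codes together and splitting them again. The following is only a minimal sketch on a made-up toy data frame; `toy` and `baskets` are illustrative names, not objects from the analysis above.

```r
# Toy order data: one row per (product, user) pair, shaped like the raw data.
toy <- data.frame(
  prod_oid = c(8332, 3598, 8332, 12808, 7505),
  user_id  = c(132591, 132591, 58804, 58804, 198893)
)

# One character vector of product codes per user.
baskets <- split(as.character(toy$prod_oid), toy$user_id)

# Keep only baskets with more than one product, like the filter(n > 1) step.
baskets <- baskets[lengths(baskets) > 1]

print(baskets)
```

A list of this shape can also be coerced with as(baskets, "transactions"), assuming no product code is repeated within a basket.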
require(arules)
## Loading required package: arules
## Loading required package: Matrix
##
## Attaching package: 'arules'
## The following object is masked from 'package:dplyr':
##
## recode
## The following objects are masked from 'package:base':
##
## abbreviate, write
retail.trans <- as(retail.list, "transactions")
summary(retail.trans)
## transactions as itemMatrix in sparse format with
## 222845 rows (elements/itemsets/transactions) and
## 5067 columns (items) and a density of 0.000498097
##
## most frequent items:
## 2674 2685 7423 8332 2173 (Other)
## 12115 11950 9853 9149 7985 511377
##
## element (itemset/transaction) length distribution:
## sizes
## 2 3 4 5 6 7 8 9 10 11
## 148855 47268 16913 6111 2241 899 331 114 54 22
## 12 13 14 15 16 17 23 26
## 14 10 5 2 2 2 1 1
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.000 2.000 2.000 2.524 3.000 26.000
##
## includes extended item information - examples:
## labels
## 1 10000
## 2 10005
## 3 10007
##
## includes extended transaction information - examples:
## transactionID
## 1 2808 11971
## 2 10999 2173
## 3 2689 2686
After converting the data to transactions, summary() shows that product ‘2674’ is the most frequent item, appearing in 12115 customers’ baskets. The median basket size is 2, so at least 50% of customers have exactly two products in their basket (148855 of 222845 baskets, about 67%).
Depending on the researcher’s experience and the purpose of the analysis, we choose thresholds for up to three measures in arules: support, confidence, and lift, to extract meaningful patterns.
Here we set thresholds on support and confidence, which is common in most research.
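To make the three measures concrete, here is a small worked example in base R. The five baskets and the item names A, B, C are invented for illustration and are not KKday data; `support()` is a helper defined only for this sketch, not an arules function.

```r
# Five toy baskets (illustrative only).
baskets <- list(
  c("A", "B"),
  c("A", "B", "C"),
  c("A", "C"),
  c("B", "C"),
  c("A", "B")
)

# Fraction of baskets that contain every item in `items`.
support <- function(items) {
  mean(sapply(baskets, function(b) all(items %in% b)))
}

sup_A  <- support("A")           # share of baskets containing A
sup_B  <- support("B")           # share of baskets containing B
sup_AB <- support(c("A", "B"))   # share of baskets containing both A and B

conf_A_B <- sup_AB / sup_A       # confidence of the rule {A} => {B}
lift_A_B <- conf_A_B / sup_B     # lift of the rule {A} => {B}

print(c(support = sup_AB, confidence = conf_A_B, lift = lift_A_B))
```

A lift below 1, as in this toy case, means A and B appear together less often than they would if they were bought independently; a lift well above 1 indicates a real association.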
sup = 0.0001
conf = 0.1
retail.rules <- apriori(retail.trans, parameter=list(supp=sup, conf=conf))
## Apriori
##
## Parameter specification:
## confidence minval smax arem aval originalSupport maxtime support minlen
## 0.1 0.1 1 none FALSE TRUE 5 1e-04 1
## maxlen target ext
## 10 rules FALSE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## Absolute minimum support count: 22
##
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[5067 item(s), 222845 transaction(s)] done [0.08s].
## sorting and recoding items ... [1488 item(s)] done [0.01s].
## creating transaction tree ... done [0.14s].
## checking subsets of size 1 2 3 4 done [0.02s].
## writing ... [4400 rule(s)] done [0.00s].
## creating S4 object ... done [0.04s].
Knowing that there are thousands of products on KKday, we set low thresholds (a minimum support of 0.0001, which arules reports as an absolute minimum support count of 22) to make sure we have enough patterns for recommendation. We end up with 4400 association rules.
# install.packages("arulesViz")
library(arulesViz)
## Loading required package: grid
# plotly_arules() is deprecated; plot() with the plotly engine replaces it
plot(retail.rules, engine = "plotly")
## Warning: plot: Too many rules supplied. Only plotting the best 1000 rules
## using measure lift (change parameter max if needed)
## To reduce overplotting, jitter is added! Use jitter = 0 to prevent jitter.
This interactive visualization tool can help us choose the parameters. By observing the distribution and the number of rules, we can decide whether to raise the thresholds.
retail.conf <- head(sort(retail.rules, by="confidence"), 20)
inspect(retail.conf)
## lhs rhs support confidence lift count
## [1] {12225} => {11359} 0.0001256479 1.0000000 1714.19231 28
## [2] {1446,2358,2768} => {2914} 0.0001032108 0.9583333 60.44715 23
## [3] {1859} => {1853} 0.0001435976 0.9411765 134.53269 32
## [4] {2791} => {2143} 0.0001256479 0.9032258 426.43931 28
## [5] {11912} => {9735} 0.0002468083 0.9016393 252.73688 55
## [6] {1446,2612,7423} => {2914} 0.0001076982 0.8888889 56.06692 24
## [7] {1862} => {1853} 0.0009872333 0.8627451 123.32164 220
## [8] {12784,2674} => {2685} 0.0002782203 0.8611111 16.05810 62
## [9] {1446,2768,7423} => {2914} 0.0002064215 0.8518519 53.73080 46
## [10] {1446,2878,7423} => {2914} 0.0001211604 0.8437500 53.21978 27
## [11] {1446,2930} => {2914} 0.0001615473 0.8372093 52.80722 36
## [12] {4239,4627,5260} => {5925} 0.0001121856 0.8333333 61.22788 25
## [13] {1446,2768,2878} => {2914} 0.0001076982 0.8275862 52.20024 24
## [14] {8416} => {8427} 0.0002019341 0.8181818 302.36771 45
## [15] {5260} => {5925} 0.0069958940 0.8140992 59.81469 1559
## [16] {1446,1922} => {2914} 0.0002243712 0.8064516 50.86717 50
## [17] {12822,2674} => {2685} 0.0001256479 0.8000000 14.91849 28
## [18] {1446,2612} => {2914} 0.0003006574 0.7976190 50.31005 67
## [19] {2479,2843} => {2312} 0.0001750095 0.7959184 83.62396 39
## [20] {2583,2674} => {2685} 0.0008840225 0.7943548 14.81322 197
Sorting the rules by confidence, highest first, shows that product ‘12225’ was bought together with ‘11359’ 100% of the time, yet this combination was bought only 28 times, which accounts for only about 0.01% of all baskets. On the other hand, product ‘5260’ has an 81% chance of being bought together with ‘5925’, and 1559 customers bought this bundle. This means that we could recommend ‘5925’ to any customer who has bought ‘5260’.
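The numbers in a rule row are tied together by simple arithmetic, which is a handy sanity check. The sketch below uses only figures printed above for the rule {5260} => {5925}; the variable names are illustrative.

```r
n_transactions <- 222845      # number of baskets, from summary(retail.trans)
count_rule     <- 1559        # baskets containing both 5260 and 5925
confidence     <- 0.8140992   # reported confidence of {5260} => {5925}
lift           <- 59.81469    # reported lift of the same rule

# support = count / number of baskets
support <- count_rule / n_transactions

# confidence = count(lhs and rhs) / count(lhs), so count(lhs) can be recovered
count_lhs <- count_rule / confidence

# lift = confidence / support(rhs), so support(rhs) can be recovered
support_rhs <- confidence / lift

print(c(support = support, count_lhs = count_lhs, support_rhs = support_rhs))
```

So roughly 1915 baskets contain ‘5260’, and about four out of five of them also contain ‘5925’.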
rules_length <- lapply(LIST(retail.rules@lhs), function(x) unlist(strsplit(x, " ")))
retail_long <- head(retail.rules[order(lengths(rules_length),retail.rules@quality$confidence,decreasing = TRUE)],20)
inspect(retail_long)
## lhs rhs support confidence lift count
## [1] {1446,2358,2768} => {2914} 0.0001032108 0.9583333 60.44715 23
## [2] {1446,2612,7423} => {2914} 0.0001076982 0.8888889 56.06692 24
## [3] {1446,2768,7423} => {2914} 0.0002064215 0.8518519 53.73080 46
## [4] {1446,2878,7423} => {2914} 0.0001211604 0.8437500 53.21978 27
## [5] {4239,4627,5260} => {5925} 0.0001121856 0.8333333 61.22788 25
## [6] {1446,2768,2878} => {2914} 0.0001076982 0.8275862 52.20024 24
## [7] {13903,13952,17927} => {13900} 0.0001615473 0.7826087 71.06782 36
## [8] {2459,2843,4016} => {2312} 0.0001211604 0.7714286 81.05092 27
## [9] {13900,13903,17927} => {13952} 0.0001615473 0.7659574 73.10055 36
## [10] {2467,2843,4016} => {2312} 0.0001525724 0.7391304 77.65748 34
## [11] {13903,13952,7452} => {13900} 0.0001525724 0.7391304 67.11961 34
## [12] {11731,13903,13952} => {13900} 0.0002557832 0.7307692 66.36034 57
## [13] {4227,4627,5260} => {5925} 0.0001032108 0.7187500 52.80905 23
## [14] {2459,2467,2843} => {2312} 0.0001211604 0.6923077 72.73800 27
## [15] {2459,2467,4016} => {2312} 0.0002198838 0.6901408 72.51034 49
## [16] {2287,2685,8332} => {2674} 0.0001929592 0.6825397 12.55473 43
## [17] {13900,13903,17688} => {13952} 0.0001346227 0.6521739 62.24141 30
## [18] {17756,2674,8332} => {2685} 0.0001211604 0.6428571 11.98808 27
## [19] {11731,13900,17688} => {13952} 0.0001032108 0.6388889 60.97353 23
## [20] {11847,18608,2322} => {18073} 0.0001480850 0.6226415 20.75580 33
The first four rules all have product ‘2914’ on the right-hand side, meaning that ‘2914’ is often bought together with the products on their left-hand sides, ‘1446’ in particular.
plot(retail.rules, method="graph", control=list(type="items"))
## Available control parameters (with default values):
## main = Graph for 100 rules
## nodeColors = c("#66CC6680", "#9999CC80")
## nodeCol = c("#EE0000FF", "#EE0303FF", "#EE0606FF", "#EE0909FF", "#EE0C0CFF", "#EE0F0FFF", "#EE1212FF", "#EE1515FF", "#EE1818FF", "#EE1B1BFF", "#EE1E1EFF", "#EE2222FF", "#EE2525FF", "#EE2828FF", "#EE2B2BFF", "#EE2E2EFF", "#EE3131FF", "#EE3434FF", "#EE3737FF", "#EE3A3AFF", "#EE3D3DFF", "#EE4040FF", "#EE4444FF", "#EE4747FF", "#EE4A4AFF", "#EE4D4DFF", "#EE5050FF", "#EE5353FF", "#EE5656FF", "#EE5959FF", "#EE5C5CFF", "#EE5F5FFF", "#EE6262FF", "#EE6666FF", "#EE6969FF", "#EE6C6CFF", "#EE6F6FFF", "#EE7272FF", "#EE7575FF", "#EE7878FF", "#EE7B7BFF", "#EE7E7EFF", "#EE8181FF", "#EE8484FF", "#EE8888FF", "#EE8B8BFF", "#EE8E8EFF", "#EE9191FF", "#EE9494FF", "#EE9797FF", "#EE9999FF", "#EE9B9BFF", "#EE9D9DFF", "#EE9F9FFF", "#EEA0A0FF", "#EEA2A2FF", "#EEA4A4FF", "#EEA5A5FF", "#EEA7A7FF", "#EEA9A9FF", "#EEABABFF", "#EEACACFF", "#EEAEAEFF", "#EEB0B0FF", "#EEB1B1FF", "#EEB3B3FF", "#EEB5B5FF", "#EEB7B7FF", "#EEB8B8FF", "#EEBABAFF", "#EEBCBCFF", "#EEBDBDFF", "#EEBFBFFF", "#EEC1C1FF", "#EEC3C3FF", "#EEC4C4FF", "#EEC6C6FF", "#EEC8C8FF", "#EEC9C9FF", "#EECBCBFF", "#EECDCDFF", "#EECFCFFF", "#EED0D0FF", "#EED2D2FF", "#EED4D4FF", "#EED5D5FF", "#EED7D7FF", "#EED9D9FF", "#EEDBDBFF", "#EEDCDCFF", "#EEDEDEFF", "#EEE0E0FF", "#EEE1E1FF", "#EEE3E3FF", "#EEE5E5FF", "#EEE7E7FF", "#EEE8E8FF", "#EEEAEAFF", "#EEECECFF", "#EEEEEEFF")
## edgeCol = c("#474747FF", "#494949FF", "#4B4B4BFF", "#4D4D4DFF", "#4F4F4FFF", "#515151FF", "#535353FF", "#555555FF", "#575757FF", "#595959FF", "#5B5B5BFF", "#5E5E5EFF", "#606060FF", "#626262FF", "#646464FF", "#666666FF", "#686868FF", "#6A6A6AFF", "#6C6C6CFF", "#6E6E6EFF", "#707070FF", "#727272FF", "#747474FF", "#767676FF", "#787878FF", "#7A7A7AFF", "#7C7C7CFF", "#7E7E7EFF", "#808080FF", "#828282FF", "#848484FF", "#868686FF", "#888888FF", "#8A8A8AFF", "#8C8C8CFF", "#8D8D8DFF", "#8F8F8FFF", "#919191FF", "#939393FF", "#959595FF", "#979797FF", "#999999FF", "#9A9A9AFF", "#9C9C9CFF", "#9E9E9EFF", "#A0A0A0FF", "#A2A2A2FF", "#A3A3A3FF", "#A5A5A5FF", "#A7A7A7FF", "#A9A9A9FF", "#AAAAAAFF", "#ACACACFF", "#AEAEAEFF", "#AFAFAFFF", "#B1B1B1FF", "#B3B3B3FF", "#B4B4B4FF", "#B6B6B6FF", "#B7B7B7FF", "#B9B9B9FF", "#BBBBBBFF", "#BCBCBCFF", "#BEBEBEFF", "#BFBFBFFF", "#C1C1C1FF", "#C2C2C2FF", "#C3C3C4FF", "#C5C5C5FF", "#C6C6C6FF", "#C8C8C8FF", "#C9C9C9FF", "#CACACAFF", "#CCCCCCFF", "#CDCDCDFF", "#CECECEFF", "#CFCFCFFF", "#D1D1D1FF", "#D2D2D2FF", "#D3D3D3FF", "#D4D4D4FF", "#D5D5D5FF", "#D6D6D6FF", "#D7D7D7FF", "#D8D8D8FF", "#D9D9D9FF", "#DADADAFF", "#DBDBDBFF", "#DCDCDCFF", "#DDDDDDFF", "#DEDEDEFF", "#DEDEDEFF", "#DFDFDFFF", "#E0E0E0FF", "#E0E0E0FF", "#E1E1E1FF", "#E1E1E1FF", "#E2E2E2FF", "#E2E2E2FF", "#E2E2E2FF")
## alpha = 0.5
## cex = 1
## itemLabels = TRUE
## labelCol = #000000B3
## measureLabels = FALSE
## precision = 3
## layout = NULL
## layoutParams = list()
## arrowSize = 0.5
## engine = igraph
## plot = TRUE
## plot_options = list()
## max = 100
## verbose = FALSE
The network graph shows associations between selected products. Larger circles imply higher support, while red circles imply higher lift.
The most popular combination was ‘2674’ with ‘2685’; another popular combination was ‘2689’ with ‘2685’.
Relatively many people buy ‘5260’ along with ‘5925’ (1559 times).
#-------------Form Function for product recommendation--------------
#Test any basket you like
new_basket = c('7781')
next_buy = function(new_basket){
  it_new_basket = as(list(new_basket), "itemMatrix")
  # find all rules where the lhs is a subset of the current new_basket
  rulesMatchLHS <- is.subset(retail.rules@lhs,it_new_basket)
  # and the rhs is NOT a subset of the current new_basket (so that some items are left as potential recommendation)
  suitableRules <- rulesMatchLHS & !(is.subset(retail.rules@rhs,it_new_basket))
  possible_recomed = retail.rules[as.logical(suitableRules)]
  if(length(possible_recomed)==0){
    print('No association rules pass the threshold, consider other possible combination ')
  }else{
    # rank the matching rules by confidence and take the rhs of the best one
    conf_order <- order(possible_recomed@quality$confidence, decreasing = TRUE)
    recommendations <- strsplit(LIST(possible_recomed@rhs)[[conf_order[1]]],split=" ")
    print("Potential recommendations are...")
    inspect(possible_recomed[conf_order,])
    recommendations <- lapply(recommendations,function(x){paste(x,collapse=" ")})
    recommendations <- as.character(recommendations)
    print(paste("Best recommendation would be ",recommendations))
    return(as.character(recommendations))
  }
}
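The matching logic in next_buy() can be seen more easily without the arules classes. The sketch below is a simplified stand-in, not the function used in this article: `toy_rules` is a hypothetical hand-written rule table (its confidences are rounded values chosen for illustration) and `recommend()` is an illustrative helper.

```r
# Hypothetical rule table: each rule has a left-hand side (a set of items),
# a single right-hand-side item and a confidence value.
toy_rules <- data.frame(
  rhs        = c("2914", "7423", "2685"),
  confidence = c(0.76, 0.15, 0.86),
  stringsAsFactors = FALSE
)
toy_rules$lhs <- list("1446", "1446", c("12784", "2674"))

recommend <- function(basket, rules) {
  # A rule applies if its whole lhs is in the basket and its rhs is not.
  lhs_ok <- sapply(rules$lhs, function(l) all(l %in% basket))
  rhs_ok <- !(rules$rhs %in% basket)
  hits <- rules[lhs_ok & rhs_ok, ]
  if (nrow(hits) == 0) return(NA_character_)
  # Return the rhs of the applicable rule with the highest confidence.
  hits$rhs[which.max(hits$confidence)]
}

r1 <- recommend("1446", toy_rules)              # both 1446 rules apply
r2 <- recommend(c("1446", "2914"), toy_rules)   # 2914 is already in the basket
r3 <- recommend("9999", toy_rules)              # no rule applies
print(c(r1, r2, r3))
```

Returning NA when nothing matches, instead of printing a message, makes it easier for a caller to fall back to another recommender.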
target_one = 11731
rules <- apriori(data=retail.trans, parameter=list(supp=sup, conf=conf, minlen=2),
                 appearance = list(default="lhs", rhs=target_one),
                 control = list(verbose=F))
rules <- sort(rules, decreasing=TRUE, by="confidence")
inspect(rules)
next_buy(target_one)
next_buy("11731")
basket = c("2014","2674")
next_buy(basket)
DT_filter = data
product_list <- unique(DT_filter$prod_oid)
user_list <- unique(DT_filter$user_id)
product_list_len <- length(product_list)
user_list_len <- length(user_list)
prod_len <- c(1:product_list_len)
# sparse product-by-user matrix, filled with 1 where a user bought a product
prod_user <- Matrix(0, nrow =product_list_len ,ncol = user_list_len)
colnames(prod_user) <- user_list
temp <- sapply(prod_len, function(x){
DT_filter_i_user <- DT_filter %>%
filter(prod_oid == product_list[x]) %>%
select(user_id)
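# map user ids to column positions; indexing with the raw ids would treat them as positions
DT_filter_i_user_v <- match(DT_filter_i_user$user_id, user_list)
prod_user[x,DT_filter_i_user_v] <<- 1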
rm(DT_filter_i_user, DT_filter_i_user_v)
if(x %% 500 == 0){
print(paste0(x,'/',product_list_len))
}
})
## [1] "500/5067"
## [1] "1000/5067"
## [1] "1500/5067"
## [1] "2000/5067"
## [1] "2500/5067"
## [1] "3000/5067"
## [1] "3500/5067"
## [1] "4000/5067"
## [1] "4500/5067"
## [1] "5000/5067"
print("prod_user matrix finish")
## [1] "prod_user matrix finish"
rm(DT_filter,temp)
DT_filter_matrix <- prod_user
row.names(DT_filter_matrix) <- product_list
rm(prod_user)
DT_rowSums <- rowSums(DT_filter_matrix)
DT_rowSums_under_5 <- names(DT_rowSums)[DT_rowSums<=5]
rm(DT_rowSums)
print("start calculating cosine_score_matrix")
## [1] "start calculating cosine_score_matrix"
xxt <- DT_filter_matrix %*% t(DT_filter_matrix)
diag_xxt <- sqrt(diag(1/diag(xxt)))
score_matrix <- diag_xxt %*% xxt %*% diag_xxt
rownames(score_matrix) <- rownames(DT_filter_matrix)
colnames(score_matrix) <- rownames(DT_filter_matrix)
rm(xxt,diag_xxt)
score_matrix <- as.matrix(score_matrix)
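The score computed above is the cosine similarity between product rows: the number of users who bought both products, divided by the square roots of each product's buyer count (`diag_xxt` holds those inverse square roots on its diagonal). A minimal sketch on a made-up 3 x 4 purchase matrix shows the same calculation in plain base R; `m`, `co_purchases` and `cosine` are illustrative names.

```r
# Toy product-by-user purchase matrix (1 = the user bought the product).
m <- matrix(c(1, 1, 0, 0,
              1, 1, 1, 0,
              0, 0, 1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("p1", "p2", "p3"), paste0("u", 1:4)))

co_purchases <- m %*% t(m)           # entry [i, j] = users who bought both i and j
norms <- sqrt(diag(co_purchases))    # diagonal = number of buyers of each product
cosine <- co_purchases / (norms %o% norms)

print(round(cosine, 3))
```

Products p1 and p3 share no buyers, so their score is 0, while every product has a score of 1 with itself.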
DT_similar_prod <- data.frame(similar_prod_oid = rep(NA_real_,product_list_len*20),
score = rep(NA_real_,product_list_len*20),
prod_oid = rep(NA_real_,product_list_len*20))
t1 <- Sys.time()
temp <- sapply(prod_len, function(x){
DT_similar_prod[(20*x-19):(20*x),] <<-
data.frame(score = score_matrix[x,],
similar_prod_oid = product_list)[-x,] %>%
arrange(desc(score)) %>%
mutate(score_order = c(1:(product_list_len-1))) %>%
filter(score_order <=20) %>%
select(similar_prod_oid, score) %>%
mutate(prod_oid = product_list[x])
if(x %% 500 == 0){
print(paste0(x,'/',product_list_len))
}
} )
## [1] "500/5067"
## [1] "1000/5067"
## [1] "1500/5067"
## [1] "2000/5067"
## [1] "2500/5067"
## [1] "3000/5067"
## [1] "3500/5067"
## [1] "4000/5067"
## [1] "4500/5067"
## [1] "5000/5067"
rm(temp)
gc()
## used (Mb) gc trigger (Mb) max used (Mb)
## Ncells 2672580 142.8 4703850 251.3 4703850 251.3
## Vcells 33402184 254.9 106178485 810.1 110423644 842.5
print("End of calculation cosine_score_matrix")
## [1] "End of calculation cosine_score_matrix"
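The loop above keeps, for every product, the 20 other products with the highest cosine score. The same idea on a tiny invented similarity matrix looks like this; `sim` and `top_k_similar()` are hypothetical names used only for this sketch.

```r
# Toy similarity matrix for four hypothetical products.
sim <- matrix(c(1.0, 0.8, 0.1, 0.3,
                0.8, 1.0, 0.0, 0.5,
                0.1, 0.0, 1.0, 0.2,
                0.3, 0.5, 0.2, 1.0),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("a", "b", "c", "d"), c("a", "b", "c", "d")))

top_k_similar <- function(sim, product, k = 2) {
  # Drop the product itself, then keep the k highest scores.
  scores <- sim[product, colnames(sim) != product]
  head(sort(scores, decreasing = TRUE), k)
}

top_a <- top_k_similar(sim, "a")
top_c <- top_k_similar(sim, "c", k = 1)
print(top_a)
print(top_c)
```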
#Apriori result
#Recommendation after buying
target_one = c(1446)
next_buy(target_one)
## [1] "Potential recommendations are..."
## lhs rhs support confidence lift count
## [1] {1446} => {2914} 0.0060131482 0.7600681 47.941514 1340
## [2] {1446} => {7423} 0.0011846799 0.1497448 3.386773 264
## [3] {1446} => {2768} 0.0010231327 0.1293250 7.054941 228
## [4] {1446} => {12245} 0.0008750477 0.1106069 19.750160 195
## [1] "Best recommendation would be 2914"
## [1] "2914"
#Cosine result
head(DT_similar_prod[DT_similar_prod$prod_oid==target_one,],10)
## similar_prod_oid score prod_oid
## 361 2914 0.53691659 1446
## 362 12245 0.13146228 1446
## 363 11293 0.09779082 1446
## 364 2768 0.08495964 1446
## 365 7423 0.06334226 1446
## 366 2878 0.06225813 1446
## 367 2881 0.05335821 1446
## 368 11866 0.04767070 1446
## 369 2603 0.04125782 1446
## 370 2948 0.04000302 1446
# Reverse recommendation: products that lead to this product
rules<-apriori(data=retail.trans, parameter=list(supp=sup,conf = conf,minlen=2),
appearance = list(default="lhs",rhs=target_one),
control = list(verbose=F))
rules<-sort(rules, decreasing=TRUE,by="confidence")
inspect(rules)
## lhs rhs support confidence lift count
## [1] {2612,2914,7423} => {1446} 0.0001076982 0.5714286 72.22915 24
## [2] {11293} => {1446} 0.0001391101 0.5438596 68.74441 31
## [3] {2768,2914,7423} => {1446} 0.0002064215 0.4339623 54.85327 46
## [4] {2881,2914} => {1446} 0.0003589939 0.4145078 52.39421 80
## [5] {12245,2768} => {1446} 0.0001166730 0.4126984 52.16550 26
## [6] {2813,2914} => {1446} 0.0001121856 0.4032258 50.96815 25
## [7] {2878,2914,7423} => {1446} 0.0001211604 0.3913043 49.46127 27
## [8] {2914} => {1446} 0.0060131482 0.3792811 47.94151 1340
## [9] {2603,2914} => {1446} 0.0001929592 0.3739130 47.26299 43
## [10] {2612,2914} => {1446} 0.0003006574 0.3701657 46.78933 67
## [11] {2914,7423} => {1446} 0.0009199219 0.3422371 43.25911 205
## [12] {2878,2914} => {1446} 0.0004711795 0.3343949 42.26786 105
## [13] {2914,7452} => {1446} 0.0001435976 0.3106796 39.27022 32
## [14] {1922,2914} => {1446} 0.0002243712 0.2941176 37.17677 50
## [15] {2768,2878,2914} => {1446} 0.0001076982 0.2926829 36.99542 24
## [16] {1919,2914} => {1446} 0.0001435976 0.2857143 36.11458 32
## [17] {2768,2914} => {1446} 0.0007673495 0.2722930 34.41811 171
## [18] {2358,2768,2914} => {1446} 0.0001032108 0.2705882 34.20263 23
## [19] {2914,2948} => {1446} 0.0002468083 0.2350427 29.70964 55
## [20] {2914,2930} => {1446} 0.0001615473 0.2307692 29.16947 36
## [21] {2358,2914} => {1446} 0.0003589939 0.2173913 27.47848 80
## [22] {11866} => {1446} 0.0001121856 0.1602564 20.25657 25
## [23] {12245} => {1446} 0.0008750477 0.1562500 19.75016 195
## [24] {2716,2914} => {1446} 0.0002871951 0.1406593 17.77948 64
head(DT_similar_prod[DT_similar_prod$similar_prod_oid==target_one,],10)
## similar_prod_oid score prod_oid
## 352 1446 0.08495964 2768
## 401 1446 0.53691659 2914
## 710 1446 0.06225813 2878
## 1002 1446 0.13146228 12245
## 3200 1446 0.03536536 2716
## 3471 1446 0.05335821 2881
## 4534 1446 0.04000302 2948
## 4600 1446 0.01136833 4406
## 4615 1446 0.04125782 2603
## 5095 1446 0.06334226 7423
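One difference between the two approaches is visible in these tables: confidence depends on direction ({1446} => {2914} is 0.76, while {2914} => {1446} is 0.38), whereas the cosine score is the same both ways (0.537). When both are computed from the same baskets, the cosine score equals the geometric mean of the two confidences, because cos = n_AB / sqrt(n_A * n_B) while the confidences are n_AB / n_A and n_AB / n_B. A quick check with the values printed above, as a sketch with illustrative variable names:

```r
conf_ab <- 0.7600681   # confidence of {1446} => {2914}, from the apriori output
conf_ba <- 0.3792811   # confidence of {2914} => {1446}, from the reverse rules

# Geometric mean of the two directional confidences.
geo_mean <- sqrt(conf_ab * conf_ba)

cosine_reported <- 0.53691659   # score for the pair (1446, 2914) in DT_similar_prod

print(c(geometric_mean = geo_mean, cosine = cosine_reported))
```

The two values agree up to the rounding of the printed confidences, so for single-product baskets the cosine list is a symmetric summary of the pairwise rules, while Apriori additionally handles multi-item left-hand sides.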
d = data.frame(data, stringsAsFactors = FALSE)
data <- datatable(d, filter = 'bottom', options = list(pageLength = 5))
data